Improved context-dependent acoustic modeling for continuous Chinese speech recognition

نویسندگان

  • Jiyong Zhang
  • Thomas Fang Zheng
  • Jing Li
  • Chunhua Luo
  • Guoliang Zhang
چکیده

This paper describes the new framework of context-dependent (CD) Initial/Final (IF) acoustic modeling using the decision tree based state tying for continuous Chinese speech recognition. The Extended Initial/Final (XIF) set is chosen as the basic speech recognition unit (SRU) set according to the Chinese language characteristics, which outperforms the standard IF set. An adaptive mixture increasing strategy is applied when splitting the single Gaussian into mixed Gaussians in each tied state after the decision tree has been constructed. Our experimental results show that these two improvements are helpful to the acoustic modeling of Chinese speech recognition and that the CD XIF model outperforms the baseline syllable model over 30%.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

متن کامل

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...

متن کامل

Acoustic modeling and language modeling for cantonese LVCSR

This paper describes our recent work on the development of a large-vocabulary, speaker-independent continuous speech recognition system for Cantonese (a major Chinese dialect). Both acoustic modeling and language modeling are being addressed. For acoustic modeling, we focus on right-context-dependent sub-syllable units. Tying of HMM at model as well as state level is applied based on phonetic k...

متن کامل

Context dependent syllable acoustic model for continuous Chinese speech recognition

The choice of basic modeling unit in building acoustic model for a continuous Mandarin speech recognition task is a very important issue [1]. Unlike traditional phoneme or Initial/Finals (IFs) units based acoustic modeling methods, which usually suffer from the limitations of less accuracy in modeling intrasyllable variations and long scale temporal dependencies, in this paper, a practicable sy...

متن کامل

English Alphabet Recognition Based on Chinese Acoustic Modeling

How to effectively recognize English letters spoken by Chinese people is our major concern in the paper. Some efforts are made to build Chinese extended Initial/Final (XIF) based HMMs for English alphabet recognition which can be integrated with large vocabulary continuous Chinese speech recognition (Chinese LVCSR) system based on a same XIF set. The alphabet-specific XIF HMMs are built using c...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2001